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Boolean Networks and their dynamics are of great interest as abstract modeling schemes in various 
disciplines, ranging from biology to computer science. Whereas parallel update schemes have been 
studied extensively in past years, the level of understanding of asynchronous updates schemes is 
still very poor. In this paper we study the propagation of external information given by regulatory 
input variables into a random Boolean network. We compute both analytically and numerically 
the time evolution and the asymptotic behavior of this propagation of external regulation (PER). 
In particular, this allows us to identify variables which are completely determined by this external 
information. All those variables in the network which are not directly fixed by PER form a core which 
contains in particular all non-trivial feedback loops. We design a message-passing approach allowing 
to characterize the statistical properties of these cores in dependence of the Boolean network and 

S the external condition. At the end we establish a link between PER dynamics and the full random 
asynchronous dynamics of a Boolean network. 

a 
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The main motivation for this work is to study the propagation of external information given by 
regulating but non-regulated variables into random Boolean networks. This process, called propagation 
of external regulation, eventually stops due to one of two reasons: Either the external information has 
propagated throughout the full network, or a core of variables cannot be fixed by PER. The statistical 
properties of these cores are determined by the ratio a between the numbers of Boolean constraints and 
of variables, and by the composition of the constraints. In this work we will embed the propagation 
dynamics of the external condition into the asynchronous update dynamics by introducing ternary 
variables of values {0,1,*}. The supplementary joker value * indicates that a variable has not been 
fixed to a Boolean value in the PER dynamics. The introduction of this new value * allow us to 
analyze the PER dynamics in terms of a new constraint satisfaction problem with the same topology 
of the original one, but where Boolean constraints are extended in a natural way in order to include 
this ternary representation. This observation allows us to apply recently developed tools from the 
statistical-mechanics approach to combinatorial optimization problems. 



PACS numbers: 
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I. INTRODUCTION 



Boolean Networks (BNs) are dynamical models originally introduced by S. Kauffman in the late 60s Since 
Kauffman's seminal work, they have been used as abstract modeling schemes in many different fields, including cell 
differentiation, immune response, evolution, and gene-regulatory networks (for an introductory review see and 
references therein). In recent days, BNs have received a renewed attention as a powerful scheme of data analysis and 
modeling of high-throughput genomic and proteomic experiments [3( . Considerable attention has been given in the 
previous literature to the classification of different attractor types present in BNs under deterministic parallel update 
dynamics [H, 0, [H, [|| . A special relevance has been attributed to the so-called critical BNs Q situated at the transition 
between an ordered and a chaotic regime. 

The original dynamical problem can been cast into a constraint satisfaction problem [H, Q . Following this approach 
presented in [13, EL E3 and partially also in it is possible to study the organization of fixed points in the 
thermodynamic limit in random Boolean networks. This leads to the identification of a transition characterized by 
the sudden emergence of a computational core. Its existence is necessary but not sufficient for a globally complex 
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phase to exist where fixed points are organized in an exponential number of macroscopically separated clusters. This 
phenomenon is found to be robust with respect to the choice of the Boolean functions. It is missing only in networks 
where all Boolean functions are of AND or OR type. The size of the complex regulatory phase is found to grow with 
the number K of inputs of the Boolean functions. 

The organization of the paper is the following: in Sec. [TT] we introduce the model focusing to Kauffman models 
with K — 2 inputs per Boolean function, in Sec. IIIII we introduce the notion of propagation of external regulation, 
the message-passing algorithm we will use for analyzing the problem. We also discuss in details the resulting phase 
diagram of the model. In Sec. IIVI we will give some analytical prediction on the random asynchronous dynamics of 
random Boolean networks, and its relation to the PER dynamics. Conclusions are drawn in Sec. [V] 



II. THE MODEL 

Let us first define properly the model we are going to investigate. It is formed by N Boolean variables collected in a 
vector s — (si, . . . , sn) S {0, 1}^; in Fig.Q]they are represented by circles. They are constrained by Boolean functions 
Fa(s ai , s a2 ) with a S INT C {1,...,N} running over M = |/iVT| different variables, each depending on two input 
variables. The generalization to more than two inputs is obvious, but in this work we will concentrate fully on the 
two- input case. These functions are represented by squares in Fig. [1] within the so-called factor-graph representation 
ofaBN. 

The functions F a allow to define a random asynchronous update dynamics, cf. [Til . Il5| : In each step an element a 
of INT is selected randomly and updated according to its regulating Boolean function 



„T+1 



Fa(s ai , S aK ) 



(1) 



with s T denoting the configuration after T time steps. This time step is repeated, after M updates every function is 
selected on average once. 




FIG. 1: Factor graph representation of a small Boolean network: circles symbolize variables, squares Boolean functions. s\ is 
an example for an external input variable, S4 for a functional variable, whereas S3 stands for a regulatory variable. 



In this work, we concentrate our attention on random factor graphs subject to two conditions: 

(a) Function nodes F a have fixed in-degree 2 and out-degree one {i.e., two input and one output variable). 

(b) Variables s a have in-degree at most one, i.e., either they depend on one single function, or they are not regulated 
by any function. 

Setting a := M/N, the degree distribution of variable nodes approaches asymptotically 



P° ut (d out ) = 

u-out : 

p m (d in ) = aS din ,i + (1 - a)5 dia:0 



(2) 
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i.e., the out-degree distribution is a Poissonian of mean Ka, while in-degree distribution is bimodal. Since in- and 
out-degrees are uncorrelated, the joint degree distribution factorizes: p(d ut, ^in) = p ont (d ou t) p ln (d m ) . Random factor 
graphs are obviously a drastic oversimplification of realistic models of gene-regulatory networks: There available 
data show evidence for a broad, possibly scale-free p out (d ut) and a more concentrated in-degree distribution being 
compatible with an exponential form [2(J, Hl[ . Nevertheless, this simplified model allows for many detailed analytic 
predictions that can guide our comprehension in more realistic and interesting cases, and is a well-controlled testing 
ground for techniques borrowed from statistical mechanics. Such techniques are not limited to random graphs and 
can be easily extended to deal with more realistic cases. 
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FIG. 2: Schematic representation of the three types of variables: external input, functional, and regulatory. 



In general, we can distinguish three sets of variables as displayed in Fig. [2] 

• External input variables: There are N — AI — (1 — a)N variables which are not regulated by any function, 
they are collected in the set EXT — {1, .., N} \ INT. They represent external inputs to the network. We do 
not consider these external inputs as dynamical variables of the system. In the original definition of Kauffman 
networks they are in fact included into the definition of the BN itself: This definition works at a = 1 but allows 
for constant functions, whose outputs are the analogues of our external input variables. 

• Regulatory variables are all those variables which arc regulated and regulate. 

• Functional variables: There are e~ 2a N variables which do not regulate any other function. 

The last both types form together the set INT of internal, regulated variables. One has to note that in a random 
network there are some variables which belong theoretically to both the external and the functional variables because 
they are isolated. They contribute only trivially to the behavior of the network and can therefore be neglected in the 
discussion. 
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TABLE I: Truth table for all 16 boolean functions of if = 2 inputs. 



We have to specify the functions acting on top of the random topology defined so far. There are 2 2 = 16 Boolean 
functions, which can be grouped into 4 classes [lq : 

(i) The two constant functions. 

(ii) Four functions depending only on one of the two inputs, i.e. si, si, S2, S2- 

(hi) Canalizing functions (AND-OR class): There are eight functions, which are given by the logical AND or 
OR of the two input variables, or of their negations. These functions are canalizing. If, e.g., in the case 
F(si, S2) = si A S2 the value of s\ is set to zero, the output is fixed to zero independently of the value of s%. It 
is said that si is a canalizing variable of F with the canalizing value zero. 

(iv) Non- canalizing functions (XOR class): The last two functions are the XOR of the two inputs, and its negation. 
These two functions are not canalizing, whatever input is changed, the output changes too. 
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We keep in mind that the case of Boolean functions depending on exactly K = 2 input variables is just the simplest 
interesting case. Since the number of Boolean functions of K variables increases as 2 2 , a complete classification of 
Boolean function becomes intractable already for relatively small K . For clarity we therefore concentrate first on true 
(K = 2)-hmctions only, i.e. on those in the canalizing AND-OR class and the non-canalizing XOR class. We therefore 
require xM functions to be in the XOR class, and the remaining (1 — x)M functions to be of the AND-OR type, with 
< x < 1 being a free model parameter. In this sense, for K = 2, the network ensemble is completely defined by 
a and x. It is interesting to note that the case x = 1 on a slightly different class of random hypergraphs, has been 
already studied in a different context in [l7j |. 



III. PROPAGATION OF EXTERNAL REGULATION 



An important dynamical process in BN is the propagation of the external condition given by the non-regulated 
variables into the network. Imagine, e.g., that all inputs of a Boolean function are external variables. Fixing the 
external condition, also the output of the considered Boolean function is fixed, the external condition has been 
propagated. Iterating this propagation, we may eventually also fix variables which do not depend directly on external 
variables, but whose inputs have been fixed in an earlier iteration step, cf. Fig. [3l Note that in the case of canalizing 
functions it is sufficient to have one input variable fixed to its canalizing value in order to be able to propagate the 
information. Thus, PER is more efficient in the case of a small fraction x of non-canalizing, XOR- type functions in 
the network. 

This propagation of external regulation may stop in two ways: First, all variables might be fixed, and the external 
information is propagated throughout the full network. Second, a core of functions survives which still have unfixed 
output. This PER core contains in particular all relevant feed-back loops which are not broken due to the inclusion 
of canalizing functions. 




FIG. 3: PER: Both inputs to the upper left Boolean function are external, so also the output is directly fixed. Once this is 
done, also the inputs to the second function are fixed, and again the information can be propagated. All variables in this small 
sample graph are therefore determined by PER, no core exists. 

The propagation dynamics of the external condition can be embedded into the asynchronous update dynamics of 
Eq. ([1]) by generalizing the Boolean variables to ternary variables, and the Boolean functions accordingly. In addition 
to the values and 1 we introduce the joker value * for variables having no fixed value. As an initial condition for 
PER, only the external variables have assigned Boolean values, whereas all nodes belonging to INT are assigned 
value *. Generalizing the functions according to Tab. [Til the dynamics of PER is given exactly by Eq. (p}, with a 
randomly selected a £ INT for each single-variable update. 

Note that, the fraction of variables reached by PER depends not only on the network but also on the precise 
external condition. In a previous publication we have analyzed the average case which is realized by the vast 
majority of external conditions. Here we are aiming at a more precise description of fluctuations of the PER core 
size due to various external conditions. Note that such fluctuations are completely due to the existence of canalizing 
functions which may or may not propagate the information of a single fixed input, in dependence on whether it has 
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TABLE II: Extended truth table for canalizing and non-canalizing functions of K = 2 inputs. 

its canalizing value or not. For x — 1 we thus expect no such fluctuations to exist, and the PER core is completely 
topological in the sense that it depends only on the underlying network and not the actual realization of the functions. 

A. Belief propagation for the ensemble of PER cores 

The algorithmic determination of the PER core for a single external condition can be achieved in linear time 
by simply following the definition of the process itself. If we want to characterize the statistical properties of the 
ensemble of all cores resulting from 2^~ a ' N external conditions, running times obviously become exponential and 
thus not feasible for large networks. 

The introduction of the joker value allows us to set up a directed belief-propagation algorithm [22|, [HI, [24| which may 
be used to efficiently describe the statistical properties of all regulated variables over all external conditions. Some 
interesting limiting cases will be discussed separately: The appearance of the first PER cores in an exponentially 
small fractions of all external conditions, the appearance of the PER core for a typical external condition, and the 
appearance of a non-zero overlap of all PER cores. 

As already mentioned, we are interested in the behavior for the full ensemble of external configurations. We therefore 
introduce the single-site distributions 

Pi (s)=2-^ N Y, ( 3 ) 

ext. cond. 

as the histogram of the value of variable Sj over all external conditions, after completion of PER. In this notation, an 
external variable obviously has 

Pi(s) = |tf»,o + ~«.,i> it EXT (4) 

since by definition it never takes value *, whereas internal variable might have a non-trivial weight also in Pi(*). 
Under PER, these probabilities get updated as 

Pi( s i) = Y 5 «i,*i(*j,*iO Pj( s i) Pk( s k) , (5) 

following the output of the extended Boolean function summed over all possible input configurations. In this equation, 
variable Sj is understood to be regulated by Sj and s^. These equations can be understood as a directed Belief 
propagation (BP). Note that the BP equations contain a very important assumption: The probability of the two 
regulating variables is factorized, i.e., they are assumed to be statistically independent. This assumption is expected 
to become asymptotically exact in the thermodynamic limit N — > oo of infinitely large random Boolean networks. 
In finite networks, it might be violated due to short loops: The input variables Sj and Sj might, e.g., depend on 
two Boolean functions with a common input and thus be correlated. More generally correlations between two input 
variables are introduced whenever they have some common ancestor. In random network such ancestors are known to 
typically have a distance of C(ln N) , and thus become irrelevant in the case of very large BNs (inside on thermodynamic 
state). 
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The update can be achieved iteratively, starting from the initial condition 



for i e EXT 
for i e INT 



(6) 



which reflects the fact that initially only the external variables are fixed, and the regulated ones are still unknown. 

Note that the original PER is included in Eq. ([5|) if the external condition is polarized completely to a single 
configuration. In this moment, all marginal probabilities remain polarized into one of the three possible values, and 
factorization in Eq. ([5]) becomes trivially fulfilled. Under this initial condition, BP would be exact on whatever 
Boolean network. 

Since BP equations are defined on a single graph, their applicability is not necessarily restricted to random BNs 
with their Poissonian outdegree distribution. In the latter case one can, however, introduce a simple average over the 
graph ensemble by introducing the histogram of the Pi(-) for all internal variables: 



V ^ = ^N £ S[ P (-)- P% (-)} 



(7) 



ieiNT 



with 6['] being a three-dimensional Dirac distribution in all components of pi. For a randomly chosen regulated 
variable, each of the inputs is regulated with probability a, and external else. So we can write an effective self- 
consistent equation for P, 



V[ P (-)} = V Pl Vp 2 Q[ Pl (-), a ]Q[p 2 (-),a} 



where 



Q[p(-),a} = aV[p(-)] + (l-a)6 



(8) 



(9) 



In this equation, each integration runs over a three-dimensional simplex, and the average (-)p is taken over the 
distribution of Boolean functions at given fraction x of non-canalizing functions. It can be solved iteratively by 
standard methods as population dynamics, but some limiting cases can be discussed analytically. 



B. The appearance of the first PER core 



For small a, only few regulated variables exist in comparison to the number of external variables. As already shown 
in [1(1 [H| , for a < 1/(1 + x) typically all variables can be fixed by PER starting from the external condition. Typically 
here means that, starting from a randomly chosen external configuration and with probability approaching one in the 
thermodynamic limit, no extensive PER core remains. 

On the other hand, one should expect that an exponentially small fraction of external conditions already leads to a 
non-zero PER core at smaller a. This happens if external variables are chosen to take as rarely as possible canalizing 
values of the Boolean functions depending on them. 

The argument can be made mathematically more stringent. For doing so, we first observe that the existence of a 
PER core for some external configurations leads to a finite fraction of variables i with pi(*) > 0. 

We therefore introduce the fractions of regulated variables never (resp. always) taking a certain value s out of 
{(),*, 1} 

t s = T[ P (s) = 0] = — S Pi(s),o > 

ieiNT 

= V[ P (s) = 1] = -L £ 5 Pi(s)A . (10) 
ieiNT 

The existence of an extensive core for some of external conditions is obviously equivalent to t 4 < 1. Note that in our 
unbiased ensemble of Boolean functions none of the values zero or one is favored. We thus expect r = T\ and tt = m 
for symmetry reasons. 
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1. Fixed-point equations 



We can use the BP equations to get a finite closed set of equations including r*. Technically this means to project 
Eq. 1(5)), which is formulated as an equation for functions over a three-dimensional simplex, to a finite number of 
variables. To achieve this, we have to discuss carefully all those cases on the right-hand side of Eq. ijHJ) which lead to 
a vanishing probability for s — * on the left-hand side. 

• For non- canalizing functions, a non-zero probability of a ^-valued output exists whenever one of the inputs 
is allowed to assume also value To contribute to r*, both inputs have to be non-*, which happens with 
probability (1 — a + ar*) 2 . 

• For canalizing functions, the situation is a bit more involved. The output is prevented from taking value * also 
if one of the inputs is frozen completely to its canalizing value, which (due to the symmetry discussed above) 
happens with probability aito — cm\. To avoid double counting with the previously discussed case, the second 
input than has to have a non-zero probability in *, which happens with a(l — t*) 

Putting the two cases together, we find 

r* = x(l - a + ar±) 2 + (1 - x)[(l - a + err*) 2 + 2a 2 7r (l - r*)] 

= (I -a + otT it f + 2(1 - x)a 2 TT (l - t*) . (11) 

This equation still depends on the probability no — 7Ti that a variable is frozen to one of the values or 1. By 
definition, this can happen only to regulated variables, but even there only if either both inputs are frozen to a fixed 
value from {0, 1} or if one input of a canalizing function is frozen to its canalizing value. In this sense, frozen variables 
are generated only by other frozen variables. On the other hand, the initial condition of Eq. ^ does not contain such 
frozen values, and under iteration of Eq. © they cannot be generated spontaneously. We thus conclude 

7T0 = 7T1 = , (12) 

and Eq. (fTTj) reduces to 

r* = (1-a + ar*) 2 . (13) 

This equation does not depend on the fraction x of canalizing functions any more. It has two solutions, a trivial one 
relevant for small a, a non-trivial one for higher a. More precisely we find 

f 1 if a < I 
{ lf a > 5 

Note that also the solution r* = 1 is physically sensible beyond a = 1. The derivation of Eq. (JTTJ) requires only 
consistency of the generalized Boolean functions, which is obviously also fulfilled for any fixed point of the BN 
containing no ^-valued variables. The correct solution for the PER core is given by both the consistency of functions 
and the maximality of the number of *, which results from the fact that only those internal variables are changed 
from ★ to or 1 which are necessary for fulfilling the generalized Boolean constraints. This will become more clear in 
in next subsection. 

The result is quite interesting. Independently of the fraction of canalizing Boolean functions, the PER cores for 
some external conditions start to exist as soon as a exceeds 1/2. Note that also the fraction of concerned variables 
does not depend on x. According to the above discussion that the PER core for a BN without canalizing functions 
is uniquely determined and of purely topological nature, the same is true for the set of variables which are contained 
at least in one PER core. This set does not depend on the choice of functions, but only on the underlying graph 
structure. 

In Fig. [4] we compare the analytical result to the BP result on single BNs of N = 10000 vertices, for various values 
of a. We plot the fraction 1 — r* of all those internal vertices which belong to the PER core for at least one external 
configuration. Besides the good coincidence between analytical results (derived in the thermodynamic limit) and 
single-sample data, one point is remarkable: The BP curves for various a;-values are practically identical. They come 
from identical graphs with different realizations of the functions, showing thus that the union of the PER cores for 
all external conditions is a topological object of the underlying graph. 
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FIG. 4: Fraction of internal variables being in at least one PER core: Analytical result (full line) versus BP results on single 
graphs of N = 10000 vertices. 



2. Dynamics 



To close this subsection, we discuss shortly the dynamics of r* (i) as a time dependent quantity. Let us assume a 
random asynchronous update. Then the expected number of sites contributing to r* after T single-variable updates 
is given by 

Mr*(T + 1) = Mr* (T) - 7*(T) + (1 - a + ar*(T)) 2 . (15) 

On the right-hand side, we have three contributions: the number of contributing sites after step T, the probability 
that the randomly selected internal site already contributes, and the probability that the inputs to this site at step 
T force the variable to contribute at step T + 1. Note that we have already canceled the constantly vanishing no 
contribution in Eq. (lllj) . In the thermodynamic limit we rescale the time as t = T/M, and the above equation 
becomes an ordinary differential equation, 

f* = -t* + (1 - a + err*) 2 

[1-a] 2 ) (r*-l) (16) 



( 2 n i2"\ 



with the initial condition T*(t = 0) = 0. This equation is solved by 

r*(t)=<^ l- ii ^oxp{(l-2a)t} ^ 2 ( 17 ) 

I th if a = § 

which, starting from zero, grows monotonously toward the fixed point given in Eq.[T5J Note that the approach to the 
fixed point is in general exponentially fast, but it slows down algebraically at the critical point, there the asymptotic 
approach happens as t~ l . Note also that, due to the initial condition r*(< = 0) = the dynamics stops as soon as it 
reaches the smallest fixed-point of r*, which corresponds to the selection of the largest number of *, and justifies our 
previous selection of the relevant solution. 
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C. Appearance and size of the typical PER core 

1. Fixed-point equations 

In the last subsection, we have seen that the first extensive PER cores appear continuously at a = 1/2, independently 
of the value of the fraction x of non-canalizing functions. These cores correspond, however, to an exponentially small 
fraction of all external conditions. The average core size remains zero in the thermodynamic limit. We can use the 
self-consistent equation ((HJ) in order to determine the average fraction of regulated nodes in the PER core, 

npER = J VpV[p] p{*) . (18) 

It can also be interpreted as the typical PER-core size, since a randomly chosen external condition leads to this size 
with a probability approaching one in the thermodynamic limit N — > oo. 

As already derived within a slightly different way in [ToL [ll[ this size is given by the self-consistent equation 

1 - npER = (1 - anpER) 2 + (1 - x)(l - anp E ii)anpER ■ (19) 

The first contribution comes from a situation where both parent nodes are not in the core, the second from canalizing 
functions with one canalizing and one core input. This equation has two solutions, the physically relevant one is the 
larger one: 

f if a < 

nPER = { a(1+x) ^ ~ l \ x (20) 
{ xJ lf « > — 

This implies that the PER core typically appears only at a > 1/(1 + x), which goes from a = 1 at s = to a = 1/2 
at x = 1. For purely canalizing networks we thus find that the typical core size remains always zero, even if some rare 
core exist beyond a — 1/2, whereas for purely non-canalizing BNs both threshold coincide since the PER core in this 
case does not depend on the external condition. 



2. Dynamics 



The dynamical evolution of this quantity can be derived following the same procedure as in the last subsection. We 
immediately give the resulting rate equation 

hpER.{t) = [a(l + x) - 1] npER,(t) - a 2 xn PER (t) (21) 

Initially all internal variables are *, so we have the initial condition uper{Q) = 1. This ordinary differential equation 
is readily solved by 

n PER (t) = i max[ ^f M] iia ^Th (22) 
1 if a — 



l+a 2 x t 1+x 



where 



( a (l+x)-l)exp{[a(l + x)-l]t} 

g{t\a,x) = YTTTTa — \ — 1 2\ ( 23 ) 

xa z + [a(l + x) — 1 — xa z \ 

Note that for the subcritical case, the system reaches tiper = after finite time, whereas it reaches its asymptotic 
value exponentially fast in the supercritical phase. Exactly at the transition we see the expected algebraic decay. 



D. The intersection of all PER cores 



As a last special case which can be investigated analytically we look at the intersection of the PER cores for all 
external conditions. The question is: Are there variables never fixed by PER? How many of these variables exist in 
a network? As before, we will first discuss the fixed point equations for the size of the intersection, and derive the 
phase-transition line for a non-trivial size. Than we are going to discuss the dynamics of PER, and how the physically 
relevant solution is selected. 
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1. Fixed-point equations 



We are interested in the variables which are * for all external conditions. Their fraction amongst the regulated 
variables equals 7r*, according to Eq. (|10[) . Let us discuss under which input conditions the output of a generalized 
Boolean function becomes constantly *: 

• N on- canalizing functions: The output is fixed to * if and only if at least one input is regulated and fixed to *. 
This happens with probability [1 — (1 — a7r*) 2 ]. 

• Canalizing functions: Here the situation is a bit more involved. The output is obviously * if both inputs are *. If 
only one input is *, the other one is not allowed to take its canalizing value - i. e. one input is "frozen" to *, the 
other "frozen away" from its canalizing value. The cumulative probability of all these cases is [2a 2 Tr+TQ — a 2 7r 2 ]. 
The negative term removes the double-counting of the case of two *- valued inputs. Note that this is strictly true 
for the AND function having canalizing values equal to zero for both inputs. For the other functions it follows 
from the symmetry condition To = t\. 



This equation depends still on To, i.e., on the probability that a variable is frozen away from x — 0. We consider the 
following cases 

• N on- canalizing functions: There are two types of contributions: In the first one, at least one input is frozen to 
*, i.e. the output becomes * too, and thus differs from zero. The second contribution stems from the situation, 
where non of the variables is frozen to *, we have to avoid any input configuration producing zero (00 and 11 
for XOR, 01 and 10 for its negation). The total probability for this is [1 — (1 — air*) 2 + 2o?(tq — 7r*) 2 ]. 

• Canalizing functions, canalized output equals zero: None of the inputs is allowed to assume the canalizing value, 
which has probability o?Tq. 

• Canalizing functions, canalized output equals one: Only the simultaneous appearance of two non-canalizing 
input variables has to be avoided, the corresponding probability reads [1 — (1 — aro) 2 ]. 

The total equation for tq thus reads 



We thus find 



7T* = X 



[l - (1 - mr*) 2 ] + (1 - x) [2a 2 7r*T - a 2 7r 2 ] 



(24) 



To = x [l — (1 — an*) 2 + 2q: 2 (to — 7T*) 2 ] + (1 — a;)aTo . 



(25) 



Eqs. (|24l25p are two closed quadratic equations, thus they can be solved analytically. The solutions are: 



1. The trivial solution 



7T* = T = 



(26) 



2. A second solution of trivial ir± 



To = 



1 + a(x - 1) 
2a 2 x 



(27) 



3. The potentially physical solution 



tq = 



1 + x - 4x 2 + a(-l - 2.x + 3x 2 + Ax 3 ) + (x - l)D 

2a 2 x(2x 2 - 1) 
1 - 4.t 2 + a(-l -3x + Ax 2 + 8x 3 ) + D 
4a 2 a;(2a; 2 - 1) 



(28) 



with 



D = y/l + a(-2 -6x + 8x 3 ) + a 2 (l + 6x + x 2 - 8x 3 ) 



(29) 
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4. The unphysical solution 

1 + x - Ax 2 + a(-l -2x + 3x 2 + Ax 3 ) - {x - l)D 



7T* = 



2a 2 x{2x 2 - 1) 



1 - 4x 2 + a(—l - 3x + 4x 2 + 8x 3 ) — D 
T ° = 4a 2 x(2x* - 1) (30) 

The first solution is the correct one for small enough a. The third one becomes potentially relevant, when the 
argument under the square root becomes positive, i.e., when D is a positive real number. This point is actually 
relevant for small x, for larger x the corresponding 7T* would be negative. It becomes positive only at a = l/(2x) 
as can be seen easily by linearizing the above equations. To summarize, the non-trivial solution becomes physical 
beyond the phase transition line 



i i e™ i ™2 lOl X ^ g 

for x > i±^S 



a{x)= ) 1+ 6, +f -8*3 (31) 



k 2x 

In this equation, we have already indicated the character of the transition: For x below the tricritical point (1 + 
\/l7)/8 ~ 0.640388, the transition is discontinuous, and the intersection of all PER cores jumps from an empty set to 
a non-zero fraction of all internal vertices. Above this point, a non-trivial intersection appears continuously crossing 
the critical line. In Fig. [5l this analytical result is shown to be well-confirmed by the BP results on a single graph of 
10000 vertices. 




FIG. 5: Fraction of internal variables being in all PER cores: Analytical result (full lines) versus BP results (symbols) on single 
graphs of N = 10000 vertices. 
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2. Dynamics 



As before, the selection of the relevant solution can be justified by the dynamical behavior of PER. Following the 
same arguments as above, we can write down two coupled ordinary differential equations for the evolution of both 7T* 
and To which are expected to be exact in the thermodynamic limit of large BNs. The equations are 

7T* = + X [l - (1 - Cnr*) 2 ] + 

(1 - x) [2aVr - c?nl] 
t = -r + x [1 - (1 - mr*) 2 + 2a 2 (t - tt*) 2 ] + 

(1 - x)aT (32) 

which have to be solved simultaneously under the initial condition 

7T*(t = 0)=ToO = 0) = l (33) 

corresponding to the fact that initially all internal variables are set constantly to *. Unfortunately, we did not find 
a closed analytical form for the solution of Eqs. (|32p , the results of a numerical integration are represented in Fig. [5] 
One can beautifully see the difference between the discontinuous transition at low x via the formation of a finite-hight 
plateau, and the continuous transition at higher fraction x of non-canalizing functions. In Fig.[7]the result of Eqs. ([32]) 
is plotted together with the actual BP dynamics on a single BN of N = 10000 variables, for x = 0.5 and various 
values of a below and above the transition point a(x) ~ 0.88185. Both are completely consistent, as to be expected 
sample-to-sample fluctuations are large close to the transition, such that the self-averaging properties of 7T* and To 
show up clearly only for larger samples. 
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FIG. 6: Time dependence of the number of variables frozen to * under random asynchronous PER, for x — 0.25 (left figure) and 
x — 0.75 (right figure). For small x, the discontinuous character of the transition can be beautifully seen from the appearance of 
a non-zero plateau, whose length diverges at the phase-transition point (a ~ 0.95505 for x = 0.25). The transition is continuous 
for large x, as can be seen in the right figure. 



E. The phase diagram 

The findings are summarized in the phase diagram in Fig. [8l We can distinguish four different regions 

• For a < 0.5 (left of the blue vertical line), all external information propagates throughout the system, with 
probability one no PER core exists. 

• For a between the blue and the black lines, some external conditions lead to a PER core. Almost all conditions, 
however, propagate throughout the full systems, a random external condition leaves almost surely no core. 
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FIG. 7: Comparison of the analytical description (full lines) with realizations in single random BNs of N = 10000 vertices 
(dotted lines). Up to sample-to-sample fluctuations, the results coincide. 

• For a in between the black and the red/green curves, a typical external condition leads to an extensive core. 
The core size starts to grow continuously at the black line, and increases when entering deeper into this phase. 

• For a beyond the red/green curve, all PER cores overlap, the intersection of all cores is non-empty. The 
transition is discontinuous for small x (green line), and continuous for larger x (red line). 



F. The distribution of PER-core sizes 

It is interesting to see directly what the distribution of the PER cores size look like in single instances of BNs. To do 
so we have generated graphs at a = 0.95 for different concentrations of non-canalizing functions x = (0,0.1,0.4) and 
different sizes N — 100,200,400. Given a single instance of the BN one can calculate the PER core sizes generated 
for each of the 2( 1 ~ Q ) Ar configurations of external variables, and finally compute their distribution. Although the time 
needed for computing a PER core is linear in N, we have to explore an exponential of number of external conditions, 
which constrains us to work at relatively small N and at values of a very close to 1. 

In Fig. [9] we display the histogram of the the core size density averaged over 60000 different samples. We also 
display the theoretical value for the average core size as computed already in [ll[ (displayed as a black thick arrows). 
The emerging scenario is coherent with the analytic predictions: 

• For x = we are in the "first cores" phase where each of the external conditions fix almost all the variables as 
indicated by the concentration of the weight of the histogram around (see left panel in Fig. [9]). 

• For x = 0.1 we are in the "typical cores" phase characterized by the formation of the small peak at high values 
of the core size density. Yet many of the external conditions fix almost all the variables as indicated by the 
concentration of the weight of the histogram around (see left panel in Fig. [5]). In this region finite size scaling 
corrections are relevant probably due to the nearby phase transition line. 
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• For x = 0.4 we are in the "all cores overlapping" phase and the histogram peaks nicely close to the theoretical 
mean value. The noisy signal at small core sizes is due to strong sample to sample fluctuations. 



IV. ON THE RANDOM ASYNCHRONOUS DYNAMICS OF BN 



In the previous sections, we have only analyzed the propagation of external regulation. This last section is dedicated 
to the full asynchronous update dynamics: Starting from a random assignment to all variables, in every step one 
function is selected randomly and its output is updated, i.e., it changes its value if and only if the function was 
unsatisfied before. As before, the time t = T/M is measured by rescaling the number of update attempts T by the 
number M of all functions, such that in a time-interval of length one each function is visited once on average. In this 
section we first give an approximate description of the evolution of the number of unsatisfied functions, and at the 
end we relate the dynamics to the PER core studied before. 

A. The energy of canalizing and non-canalizing functions 

In order to characterize the asynchronous dynamics we introduce a Hamiltonian which counts the number of 
unsatisfied Boolean constraints: 

= S a® F a{s ai ,-;S aK ) 

aeINT 

= [(1 - x)e c + xe nc ]M . (34) 

The symbol © stands for the logical XOR operation, i.e. each cost term contributes to the sum if Eq. (fT|) is fulfilled, 
and 1 otherwise. We have divided this energy into contributions coming from canalizing functions, and those from 
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FIG. 9: Distribution of core sizes at a — 0.95 for two different concentration of non-canalizing functions x = 0.0,0.1,0.4 
corresponding respectively to the "first cores" phase (left panel), the "typical core" phase (central panel), and the "all cores 
overlapping" phase (right panel) respectively (see Fig. [8}. The black solid arrows are the theoretical average core density values 
in the thermodynamic limit for the corresponding (a.x) values as computed in [Till. 
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non-canalizing functions. More precisely, e c stands for the fraction of unsatisfied canalizing functions, e nc for the 
fraction of unsatisfied non-canalizing functions. In the beginning, all variables take random values. Each function is 
therefore violated with probability 1/2, 

e c (t = 0) - e nc (t = 0) = ~ . (35) 

Despite some efforts in the last years [5^, [2(| H3, [28, 29], the dynamics of diluted models remains an unsolved problem, 
and we have to restrict our analysis to approximate methods. Following the ideas of [2^ |. and restricting our attention 
to the simplest level of description, we can close the equations for e c (t) and e nc (t) by assuming that at each step all 
configurations of given energy values are equally probable (3lj . This is clearly an approximate assumption, but it 
leads to a good quantitative description of the real dynamics. 

As in the case of the core evolution, the dynamics leads to simple ordinary differential equations. We first write 
them down, and then discuss the meaning of all contributions: 



(l-x)e c = -(1 - x) e c + [(1 - x) e c + xe nc ] a(l - x)(l - 2e c ) 

xe nc = ~xe nc + [(1 - x) e c + xe nc ] 2ax(l - 2e nc ) (36) 

Let us first discuss the first equation. The prefactor (1 — x) on the left-hand side comes from the fact that the 
time is measured with respect to M, whereas e c is a fraction of the (1 — x)M canalizing functions. The first term 
on the right-hand side is the contribution of the selected function itself: With probability (1 — x) it is canalizing, 
and with probability e c it changes from unsatisfied to satisfied, and the energy decreases by one. The second term 
comes from the functions regulated by the output of the updated functions: With probability [(1 — x)e c + xe nc ] 
the selected function was updated, and thus the dependent functions might change status. There are on average 2a 
such functions, a fraction (1 — x) of them being canalizing. They change status only if the other input has not its 
canalizing value (factor 1/2). The energy goes up for the fraction (1 — e c ) of satisfied functions, and down for the 
fraction e c of unsatisfied functions. The second equation is similar, with the main difference that the energy status 
of a non-canalizing function changes always if one input is changed, leading to a relative factor 2 compared to the 
canalizing case. 

These two equations have the obvious fixed point e c — e nc — 0. It corresponds to a fixed point of the microscopic 
dynamics in the sense that all functions are satisfied, and the update dynamics never flips any of the variables. Are 
there also other, positive-energy solutions? Assuming that a non-zero solution appears continuously, we linearize the 
fixed point equation 

= -(1 - x) e c + [(1 - x) e c + xe nc ] a(l - x) 

= — x e nc + [(1 — x) e c + x e nc ] 2ax (37) 

Adding these two equations, we find that the non-zero solution appears at 

1 



1 + x ' 



(38) 



i.e. at the same point where typically the PER core appears. This transition corresponds to critical BN 11]. For 
smaller a, the system fastly approaches a zero-energy fixed point, the basic mechanism being fixation of almost all 
variables by PER. Above the transition, the system settles down at positive energy. At least a part of the core 
variables continues to flip for ever (in the thermodynamic limit). 

To confirm this picture, we have numerically integrated Eqs. (f36|) and in parallel performed direct numerical 
simulations on large BN. The results are shown in Fig.rXUl and confirm the analytical findings. We see, in particular, 
that the simple approximation already gives a quantitatively good description of the dynamics. Following the ideas 
of [27] this result can surely be improved, but doing so goes beyond the scope of the present paper. 



B. Relating PER and dynamics 

In Sec. IIIII we have studied PER also as a dynamical process. In the initial condition all external variables are 
fixed to some configuration in {0, 1} N ~ M , whereas internal variables are initially assigned the joker value *. This 
generalized configuration can be understood as a projection of the set of all 2 aN initial conditions with fixed external 
and changeable internal nodes. This consideration makes clear that (for the fixed external configuration) all variables 
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FIG. 10: Time evolution of the energy: Approximate analytical description (full lines) vs. numerical simulations (symbols) 
on single BNs of N = 10 6 . The inset shows the different evolution of the energy contributions coming from canalizing and 
non-canalizing functions. 

not belonging to the PER core get fixed after 0(N log N) elementary steps. Core variables potentially might go on 
flipping for ever. However, also subsets of core variables could freeze self-consistently due to feedback loops. These 
subsets depend both on the initial condition of the regulated variables and on the realization of random update 
dynamics. 

To investigate this question, we have analyzed the dynamics restricted to the PER core. We first fixed all non-core 
variables by PER, note that this can achieved efficiently for any external condition. Then we set the core variables 
randomly. This slightly atypical choice of initial condition helps to make more evident the dynamics on the core. In 
the asynchronous dynamics, all these variables (or their regulating function) are on average seen equally often, but 
they are flipped if and only if the function was not satisfied before. We record the number of actual variable changes 
for each PER core site, in order to identify frozen versus changeable variables. 

In Fig. [TT] we display the number of actual spin-flips for each variable during a run of 80000 sweeps (each 
sweep consisting of aN attempted spin flips). We measure these quantities at different times, namely after 
t = 10000, 20000,40000 and 80000 sweeps. We consider a random BN with N = 1000, M = aN = 900, x = 0.2. The 
configuration of the external inputs is chosen at random and leads to a PER core of 408 variables. 

We find two types of behavior: In this sample more than half of the variables become frozen at the very beginning 
of the dynamics and change only up to about 10 times. This happens despite the fact that they belong to the core 
and thus are not fixed by PER. Freezing thus has to appear self-consistently due to feedback loops. The second group 
of variables goes on changing. The number of their spin-flips grows proportionally to the number of sweeps, as follows 
from the parallel curves in Fig. 111! After a few sweeps the dynamics becomes concentrated to this second class of 
variables. Note however that the subdivision of core variables into these two classes depends on the initial condition 
and the order of the updates which by definition is random. Both identity and number of frozen spins fluctuate 
from realization to realization, but the qualitative scenario remains the same. It would be very interesting to get a 
quantitative analytical understanding of this phenomenon. 
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FIG. 11: Ranked distribution of the number of elementary spin-updates of the set of variables belonging to the PER core, for 
N = 1000, M — 900, x — 0.2 and up to 80000 asynchronous updates sweeps (see text). The PER core in this case is made of 
408 variables. The 250 first-ranking spins fix their value after ~ 10 steps while the remaining spins keep changing their values. 



V. CONCLUSIONS 

In this work we have studied the propagation of external regulatory information into a random Boolean network. 
The efficiency of this propagation depends on two control parameters: The fraction (1 — a) of non-regulated input 
variables in between all Boolean variables, and the fraction x of non-canalizing variables. 

We find that for a < 1/2 the external condition propagates through the full BN and fixes efficiently all variables. 
For higher a, a fraction of variables remain unfixed by this process, forming thus the PER core of the BN. The precise 
PER core depends on the external condition. We have studied the fluctuation of the PER core size between different 
external conditions. Only for a > 1/(1 + x) the PER core is not empty for almost all external conditions, for even 
higher a all PER cores start to overlap. 

The notion of the PER core is intimately related to the dynamics of the system. After very short time all non-core 
variables become frozen, so only core variables can be considered as the true dynamical degrees of freedom of the 
BN under a specified external configuration. However, also a part of the core variables becomes dynamically frozen 
during the dynamical evolution of the system, the reason being self-consistently fixed feedback loops. The dynamics 
of the system for longer time scales becomes concentrated on the remaining variables. 

In a future work, we plan to extend the PER analysis also to the analytical calculation of the PER core size 
distribution, which so far we obtained only numerically for small BNs. In this context it would be interesting to 
algorithmically identify those external conditions which lead to largest or smallest cores. This would help to solve the 
inverse problem of determining the external conditions leading to a specific core. 

A second interesting extension of the current work would be a deepened analysis of the dynamical behavior of this 
model. In distinction to other models on diluted networks, including in particular diluted spin glasses, BN are defined 
on directed graphs. This leads to some technical simplifications which we hope to open new ways in the understanding 
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of their dynamical behavior. 
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